LoongArch64 LSX fast-path for `str.contains(&str)` #144393

heiher · 2025-07-24T12:03:20Z

Benchmark results with LLVM 21 on LA664:

OLD:
test bench_is_contained_in ... bench:          43.63 ns/iter (+/- 0.04)

NEW:
test bench_is_contained_in ... bench:          12.81 ns/iter (+/- 0.01)
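
For readers who want to try the measured operation themselves, here is a minimal sketch of timing `str::contains` with a `&str` needle. It is not the in-tree `bench_is_contained_in` benchmark: the helper name `time_contains`, the inputs, and the iteration count are all hypothetical, and a plain `Instant` loop is much noisier than the bench harness that produced the numbers above.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `haystack.contains(needle)` `iters` times and returns the last result
/// together with the average wall-clock time per call.
/// `iters` must be non-zero, since the elapsed time is divided by it.
fn time_contains(haystack: &str, needle: &str, iters: u32) -> (bool, Duration) {
    let mut found = false;
    let start = Instant::now();
    for _ in 0..iters {
        // `black_box` discourages the optimizer from hoisting the search out of the loop.
        found = black_box(haystack).contains(black_box(needle));
    }
    (found, start.elapsed() / iters)
}

fn main() {
    // Hypothetical inputs, not the ones used by the in-tree benchmark.
    let haystack = "the quick brown fox jumps over the lazy dog";
    let (found, per_call) = time_contains(haystack, "lazy", 1_000_000);
    println!("found = {found}, about {per_call:?} per call");
}
```

Build with optimizations (for example `rustc -O`) before comparing timings. Whether a given call takes the SIMD fast path depends on the target and its enabled features.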

rustbot · 2025-07-24T12:03:25Z

r? @tgross35

rustbot has assigned @tgross35.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

library/core/src/str/pattern.rs

heiher · 2025-07-29T14:55:22Z

@bors r=tgross35

bors · 2025-07-29T14:55:26Z

@heiher: 🔑 Insufficient privileges: Not in reviewers

tgross35 · 2025-07-29T18:26:26Z

Thanks!

@bors r+

bors · 2025-07-29T18:26:29Z

📌 Commit 1ceacf5 has been approved by tgross35

It is now in the queue for this repository.

bors · 2025-07-29T20:44:36Z

⌛ Testing commit 1ceacf5 with merge ba7e63b...

bors · 2025-07-29T23:51:57Z

☀️ Test successful - checks-actions
Approved by: tgross35
Pushing ba7e63b to master...

github-actions · 2025-07-29T23:55:09Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 686bc1c (parent) -> ba7e63b (this PR)

Test differences

2 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard ba7e63b63871a429533c189adbfb1d9a6337e000 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

x86_64-apple-1: 6947.6s -> 10441.2s (50.3%)
dist-x86_64-apple: 7315.4s -> 10255.6s (40.2%)
dist-aarch64-apple: 7478.4s -> 5085.3s (-32.0%)
aarch64-msvc-1: 6960.6s -> 9107.5s (30.8%)
x86_64-gnu-distcheck: 8550.9s -> 7795.3s (-8.8%)
aarch64-apple: 4886.7s -> 5221.5s (6.9%)
tidy: 107.5s -> 114.4s (6.4%)
dist-riscv64-linux: 4780.5s -> 5080.3s (6.3%)
x86_64-apple-2: 3684.0s -> 3471.8s (-5.8%)
dist-aarch64-linux: 5635.6s -> 5899.6s (4.7%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

rust-timer · 2025-07-30T01:03:41Z

Finished benchmarking commit (ba7e63b): comparison URL.

Overall result: ❌ regressions - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

If the regression was expected or you think it can be justified,
please write a comment with sufficient written justification, and add
@rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
If you think that you know of a way to resolve the regression, try to create
a new PR with a fix for the regression.
If you do not understand the regression or you think that it is just noise,
you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.9% | [0.9%, 1.0%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (primary 5.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 5.6% | [5.6%, 5.6%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 5.6% | [5.6%, 5.6%] | 1 |

Cycles

Results (secondary -2.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.7% | [-3.1%, -2.3%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 469.682s -> 469.063s (-0.13%)
Artifact size: 376.80 MiB -> 376.81 MiB (0.00%)

lqd · 2025-07-30T05:56:18Z

match-stress noise

@rustbot label: +perf-regression-triaged

rustbot assigned tgross35 Jul 24, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 24, 2025

tgross35 reviewed Jul 29, 2025

heiher force-pushed the str-contains-lsx branch from 1e2fd99 to 1ceacf5 on July 29, 2025 13:37

tgross35 approved these changes Jul 29, 2025

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 29, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 29, 2025

bors merged commit ba7e63b into rust-lang:master Jul 29, 2025
11 checks passed

rustbot added this to the 1.90.0 milestone Jul 29, 2025

bors mentioned this pull request Jul 30, 2025

byte_pattern: share the TwoWaySearcher between byte and str #135931

Open

rustbot added the perf-regression Performance regression. label Jul 30, 2025

heiher deleted the str-contains-lsx branch July 30, 2025 05:05

rustbot added the perf-regression-triaged The performance regression has been triaged. label Jul 30, 2025
